Clustering functional data into groups by using projections

A Delaigle; P Hall; T Pham

Journal article

Journal of the Royal Statistical Society: Series B (Statistical Methodology) | Wiley | Published: 2019

DOI: 10.1111/rssb.12310

Abstract

We show that, in the functional data context, by appropriately exploiting the functional nature of the data, it is possible to cluster the observations asymptotically perfectly. We demonstrate that this level of performance can sometimes be achieved by the k-means algorithm as long as the data are projected on a carefully chosen finite dimensional space. In general, the notion of an ideal cluster is not clearly defined. We derive our results in the setting where the data come from two populations whose distributions differ at least in terms of means, and where an ideal cluster corresponds to one of these two populations. We propose an iterative algorithm to choose the projection functions in..
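
As a rough illustration of the idea described in the abstract (project the curves on a finite dimensional space, then run k-means on the projections), the following Python sketch may help. It is not the authors' algorithm: the toy data, the single projection function sqrt(2) cos(2 pi t) and the helper names `simulate_curves`, `project` and `two_means` are all assumptions made for illustration, and the projection function is fixed by hand rather than chosen by the iterative procedure that the paper proposes.

```python
import math
import random


def simulate_curves(n_per_group, grid, rng):
    """Toy data (assumed): two groups of curves whose mean functions differ."""
    curves, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_group):
            a = rng.gauss(0.0, 1.0)          # large variation shared by both groups
            b = rng.gauss(2.0 * label, 0.1)  # group means differ along cos(2*pi*t)
            curves.append([
                a * math.sin(2 * math.pi * t)
                + b * math.cos(2 * math.pi * t)
                + rng.gauss(0.0, 0.05)       # pointwise measurement noise
                for t in grid
            ])
            labels.append(label)
    return curves, labels


def project(curve, basis, dt):
    """Approximate each inner product <curve, psi> by a Riemann sum."""
    return [sum(x * p for x, p in zip(curve, psi)) * dt for psi in basis]


def two_means(points, n_iter=50):
    """Lloyd's k-means with k = 2 and a deterministic initialisation."""
    def dist2(u, v):
        return sum((ui - vi) ** 2 for ui, vi in zip(u, v))

    # Start from the first point and the point farthest from it.
    centres = [points[0], max(points, key=lambda p: dist2(p, points[0]))]
    assign = None
    for _ in range(n_iter):
        new_assign = [
            0 if dist2(p, centres[0]) <= dist2(p, centres[1]) else 1
            for p in points
        ]
        if new_assign == assign:
            break
        assign = new_assign
        for k in (0, 1):
            members = [p for p, g in zip(points, assign) if g == k]
            if members:
                centres[k] = [sum(c) / len(members) for c in zip(*members)]
    return assign


rng = random.Random(0)
m = 100
grid = [j / m for j in range(m)]
curves, labels = simulate_curves(50, grid, rng)

# Hand-picked projection function (an assumption, not the paper's choice).
basis = [[math.sqrt(2) * math.cos(2 * math.pi * t) for t in grid]]
scores = [project(c, basis, 1 / m) for c in curves]

assign = two_means(scores)
agree = sum(a == l for a, l in zip(assign, labels)) / len(labels)
accuracy = max(agree, 1 - agree)  # cluster labels are arbitrary, so allow a swap
print(accuracy)
```

In this toy setting the two groups are far apart along the chosen direction, whereas the sine component carries most of the variation but no group information, so the example hints at why the choice of projection matters. How to make that choice from the data is the subject of the paper and is not attempted here.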

University of Melbourne Researchers

Aurore Delaigle (Author)

Related Projects (2)

NEW NONPARAMETRIC STATISTICAL METHODS FOR IMPERFECTLY OBSERVED DATA

Statistical science today is facing the challenge of having to answer questions about data that are more complex than ever before. Some of ..

Statistical challenges involving indirect data

This project aims to develop statistical methodology for solving contemporary problems involving indirectly observed data whose complexity i..

Grants

Awarded by Australian Research Council

Funding Acknowledgements

This research was supported by grants and fellowships from the Australian Research Council (DP170102434, FT130100098 and FL110100003). The Australian weather data that we used in the paper were assembled by the Australian Bureau of Meteorology. They are available from the Research Data Archive at the National Center for Atmospheric Research, Computational and Information Systems Laboratory: http://rda.ucar.edu/datasets/ds482.1. Bob Dattore is acknowledged for providing the data. The wheat and the octane data are available in Shang and Hyndman's (2018) fds R package. The Berkeley growth data are available in the R fda package of Ramsay et al. (2014). We thank the Joint Editor, the Associate Editor and two reviewers for their helpful comments, which improved the paper significantly.